A new fast technique for pattern matching in biological sequences
Abstract
At numerous phases of the computational process, pattern matching is essential. It enables users to search for specific DNA subsequences or sequences in a database. In addition, some of these rapidly expanding biological databases are updated on a regular basis. Pattern searches can be improved by using high-speed algorithms. Researchers are striving to improve solutions in areas of bioinformatics as data grows exponentially. Faster algorithms with a low error rate are needed for real-world applications. As a result, this study offers two algorithms that were created to help speed up sequence searches. The recommended strategies improve performance by utilizing word-level processing rather than the character-level processing used in previous research studies. In terms of time cost, the proposed algorithms (EFLPM and EPAPM) increased performance by leveraging large size. The experimental results show that the proposed methods are faster than other algorithms for both short and long patterns. The EFLPM algorithm is 54% faster than the FLPM method, while the EPAPM algorithm is 39% faster than the PAPM method.
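To make the contrast between character-level and word-level processing concrete, the following is a minimal sketch and not the EFLPM or EPAPM algorithm, whose details the abstract does not give. It assumes a 2-bit encoding of the DNA alphabet (the `CODE` table) and uses hypothetical helper names (`pack`, `find_char_level`, `find_word_level`). The word-level variant packs each text window into a single integer so that one integer comparison replaces up to m character comparisons.

```python
# Illustrative sketch only: NOT the EFLPM/EPAPM algorithms from the paper.
# Assumption: DNA over {A, C, G, T}, encoded with 2 bits per base.
CODE = {"A": 0, "C": 1, "G": 2, "T": 3}


def pack(seq):
    """Pack a DNA string into one integer, 2 bits per base."""
    value = 0
    for base in seq:
        value = (value << 2) | CODE[base]
    return value


def find_char_level(text, pattern):
    """Baseline: compare the pattern one character at a time."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    hits = []
    for i in range(n - m + 1):
        j = 0
        while j < m and text[i + j] == pattern[j]:
            j += 1
        if j == m:
            hits.append(i)
    return hits


def find_word_level(text, pattern):
    """Compare each whole window to the pattern as one packed word."""
    m, n = len(pattern), len(text)
    if m == 0 or m > n:
        return []
    target = pack(pattern)
    mask = (1 << (2 * m)) - 1        # keeps only the last m bases
    window = pack(text[:m])
    hits = [0] if window == target else []
    for i in range(m, n):
        # Slide by one base: shift in the new code, drop the oldest.
        window = ((window << 2) | CODE[text[i]]) & mask
        if window == target:
            hits.append(i - m + 1)
    return hits
```

Python integers are arbitrary precision, so the packed window here can exceed a machine word. In a language with fixed 64-bit words, this encoding would fit 32 bases per word, and longer patterns would need several words per comparison.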
Similar resources
A Procedure for Biological Sensitive Pattern Matching in Protein Sequences
A procedure for fast pattern matching in protein sequences is presented. It uses a biological metric, based on substitution matrices such as PAM or BLOSUM, to compute the matching. Biologically sensitive pattern matching performs pattern detection according to the available empirical data about similarity and affinity relations between amino acids in protein sequences. Sequence alignments is a string ...
Experimental Results on Multiple Pattern Matching Algorithms for Biological Sequences
With the remarkable increase in the number of DNA and protein sequences, it is very important to study the performance of multiple pattern matching algorithms when querying sequence patterns in biological sequence databases. In this paper, we present a performance study of the running time of well-known multiple pattern matching algorithms on widely used biological sequence databases containin...
A New String Matching Algorithm for Searching Biological Sequences
String matching algorithms play a key role in many computer science problems, and in the implementation of computer software. This problem has received, and continues to receive a great deal of attention due to various applications in text manipulation, information retrieval, speech recognition, image and signal processing and computational biology. In this study, we propose a new algorithm cal...
Fast Practical Exact and Approximate Pattern Matching in Protein Sequences
Here we design, analyse and implement an algorithm that searches for motifs in protein sequences using masking techniques ("word-level" parallelism). Our algorithm speeds up known algorithms by a factor of 20 (or the alphabet size). Furthermore, we present graphs of the running times of the algorithm in comparison to its theoretical time complexity.
A New Approach to Pattern Matching in Degenerate DNA/RNA Sequences and Distributed Pattern Matching
In this paper, we consider the pattern matching problem in DNA and RNA sequences where either the pattern or the text can be degenerate, i.e., contain sets of characters. We present an asymptotically faster algorithm for the above problem that works in O(n log m) time, where n and m are the lengths of the text and the pattern, respectively. We also suggest an efficient implementation of our algorithm...
Journal
Journal title: The Journal of Supercomputing
Year: 2022
ISSN: ['0920-8542', '1573-0484']
DOI: https://doi.org/10.1007/s11227-022-04673-3